ECE 5725 Final Project: Cole Gilbert (nsg68) and Audrey Sackey (aas332)
Introduction
Inspired by the concept of gesture-controlled technology, our project aims to design an automatic selfie
camera that takes photos in response to a handclap by pointing the camera in the direction of the clap.
The system makes use of the PiCamera, the PiTFT, and two microphones for audio input and signal processing.
Our design comes in two parts: the PiTFT user display, which comprises the UI and all of its menu items, and
the controls setup, which consists of two USB microphones, the PiCamera, and the servos that aim the camera.
Objective
The objective of this project is to streamline the process of taking selfies through the integration of modern
gesture recognition technology with user-friendly interfaces. By utilizing the capabilities of the Raspberry Pi
ecosystem, we aim to create an intuitive and seamless user experience that eliminates the need for physical
interaction with the camera. This project not only simplifies the selfie-taking process but also explores the
practical applications of sound localization and gesture control in consumer electronics. The end goal is to
deliver a robust and reliable system that responds accurately to the user’s gestures, providing an innovative
and fun way to capture moments with ease. Also, it’s just really cool to have a camera point directly at you
and take your photo when you clap! Below is a diagram of our initial project:
Design and Testing
Capturing the audio input
For accurate signal processing, we connected two microphones to the Raspberry Pi, which capture input from the
surroundings for processing. The microphones record the input and save it to .wav files. We use two
microphones both for accuracy and to determine the direction of the sound. We then read the frames from
these files and perform an amplitude analysis to distinguish a handclap from ambient noise. Through testing
we found that the amplitude of handclaps ranges from about 10,000 to 30,000. Since we are using 16-bit PCM, the
audio range for the mics is from -32,768 to 32,767. Therefore, relative to the max PCM value, our handclaps are
near the peaks, which is expected. To set reasonable thresholds for handclap detection, we compared the peaks of
handclaps to components of ambient
noise, such as human speech. Specifically, we compared four handclaps to four spoken words: “Ladies and
gentlemen, welcome.”
As shown in the images, the handclap amplitudes are distinct and fall within the range of tens of thousands,
while human speech amplitudes are within the range of 0-5000. This clear distinction between handclap amplitudes
and ambient noise justified our method of filtering using amplitude.
Our program reads the frames from these files every second to ensure we don't miss a handclap and to enable
fast, streamlined signal processing. The output .wav files are likewise overwritten every second to
incorporate new audio input.
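As a rough sketch of this amplitude check (the file name and helper function below are illustrative, not our
exact code), the peak of a 16-bit PCM recording can be computed with Python's wave module and NumPy:

    import wave
    import numpy as np

    def peak_amplitude(wav_path):
        """Return the largest absolute sample value in a 16-bit PCM .wav file."""
        with wave.open(wav_path, 'rb') as wf:
            frames = wf.readframes(wf.getnframes())
        # Interpret the raw bytes as signed 16-bit samples (-32,768 to 32,767)
        samples = np.frombuffer(frames, dtype=np.int16)
        return int(np.max(np.abs(samples)))

    # A handclap should push the peak into the tens of thousands, while
    # speech and background noise stay below roughly 5,000.
    print(peak_amplitude('mic_left.wav'))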
Using the input to control the motors
Through repeated trials, we established a threshold of 15,000 for detecting a handclap. Although the image
above of the handclap peaks indicates that some peaks may fall slightly below this threshold, those were
typically associated with very weak handclaps. We chose to optimize our system to detect strong handclaps so
that it effectively filters out environmental noise that might include weaker claps.
To determine the direction of the handclaps, we used two microphones. While both microphones receive the same
input, their amplitude readings can vary based on their proximity to the handclap source. Our program analyzes
these amplitude differences to infer the direction of the sound.
Through extensive testing, we found that if the amplitude difference between the microphones is small, that
is, between 400 and 7,000, we infer that the handclap source is nearly equidistant from both microphones,
and the servos are programmed to keep the camera centered. When the left microphone records a significantly
higher amplitude, with a difference outside this range, the program directs the servo to move the camera to
the left. Conversely, if the right microphone records a higher amplitude, the servo moves the camera to the
right.
This method ensures accurate and responsive camera positioning.
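A simplified version of this decision rule might look like the following sketch; the constants mirror the
thresholds described above, but the function name is ours, and treating every difference up to the band's
upper bound as centered is our simplification:

    CLAP_THRESHOLD = 15000   # minimum peak amplitude treated as a handclap
    CENTER_MAX_DIFF = 7000   # peak differences up to ~7,000 are treated as centered

    def infer_direction(left_peak, right_peak):
        """Classify a detected handclap as 'left', 'right', or 'center' (None if too quiet)."""
        if max(left_peak, right_peak) < CLAP_THRESHOLD:
            return None  # not loud enough to be a handclap
        diff = abs(left_peak - right_peak)
        if diff <= CENTER_MAX_DIFF:
            # small differences (roughly the 400-7,000 band) mean the clap is
            # nearly equidistant from both microphones
            return 'center'
        return 'left' if left_peak > right_peak else 'right'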
Taking a picture
Our program is designed to take a picture after the servos have performed specific movements. For instance, in
a left-turn movement, the horizontal servo first rotates the camera to the left, and then the vertical servo
housing the PiCamera tilts downward to position the camera towards the subject. Once the camera is correctly
positioned, the PiCamera is accessed to capture the photo. After the picture is taken, the vertical servo tilts
back up, and both servos return to the center position.
The PiCamera interfaces seamlessly with the rest of the system through the Raspberry Pi. The control logic for
the servos and the PiCamera is managed by a Python script running on the Raspberry Pi. This script uses
libraries such as ‘RPi.GPIO’ for servo control and picamera for interfacing with the PiCamera. One key thing we
ensure is that there is a reasonable delay during the entire motion and picture-taking process to avoid blurred
photos. This delay allows the servos to stabilize and the PiCamera to focus properly, ensuring clear and sharp
images. This integrated approach ensures that the system operates smoothly and efficiently, providing a seamless
user experience.
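A condensed sketch of the left-turn sequence is shown below. The GPIO pin numbers, PWM duty cycles, and
delays are placeholders rather than our exact calibration; the sketch assumes 50 Hz hobby servos driven
through RPi.GPIO's software PWM:

    import time
    import RPi.GPIO as GPIO
    from picamera import PiCamera

    GPIO.setmode(GPIO.BCM)
    GPIO.setup(13, GPIO.OUT)   # horizontal (pan) servo
    GPIO.setup(19, GPIO.OUT)   # vertical (tilt) servo
    pan = GPIO.PWM(13, 50)     # standard 50 Hz hobby-servo signal
    tilt = GPIO.PWM(19, 50)
    pan.start(7.5)             # roughly the center position
    tilt.start(7.5)

    camera = PiCamera()

    def take_left_selfie(index):
        pan.ChangeDutyCycle(10.0)   # rotate the camera to the left
        time.sleep(1.0)             # let the servo settle to avoid blur
        tilt.ChangeDutyCycle(6.0)   # tilt down toward the subject
        time.sleep(1.0)
        camera.capture('selfie{}.jpg'.format(index))
        tilt.ChangeDutyCycle(7.5)   # tilt back up
        time.sleep(0.5)
        pan.ChangeDutyCycle(7.5)    # return to center
        time.sleep(0.5)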
Sending an email
One key element of our selfie camera was ensuring our users could have access to the photos that they took on
the device. We implemented this with a program that would email the user an image that they selected out of the
many images they took. To get this to work we first made an email address for the project:
“ece5725selfiecam@gmail.com”. We then used an SMTP (Simple Mail Transfer Protocol) server to send emails
from Python code, making sure to attach the selected image.
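A trimmed-down sketch of the email step is below, using Python's standard smtplib and email libraries; the
recipient, message text, and the way the Gmail app password is supplied are illustrative:

    import smtplib
    from email.mime.multipart import MIMEMultipart
    from email.mime.image import MIMEImage
    from email.mime.text import MIMEText

    def send_selfie(recipient, image_path, app_password):
        sender = 'ece5725selfiecam@gmail.com'
        msg = MIMEMultipart()
        msg['From'] = sender
        msg['To'] = recipient
        msg['Subject'] = 'Your selfie from the ECE 5725 selfie cam!'
        msg.attach(MIMEText('Thanks for using the automatic selfie camera.'))

        # attach the selected photo from the library
        with open(image_path, 'rb') as f:
            msg.attach(MIMEImage(f.read(), name=image_path))

        # Gmail's SMTP server over SSL; log in with the generated app password
        with smtplib.SMTP_SSL('smtp.gmail.com', 465) as server:
            server.login(sender, app_password)
            server.send_message(msg)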
User interface
The glue that connects all of these components together is the user interface, which is a combination of the
PiTFT display, the PiTFT touchscreen functionality, and the various possible states for the automatic selfie
camera. We used PyGame, Pigame, and PiTFTTouchscreen extensively to allow for touch interactions with the
screen. Every touch was registered and, based on its location and the current state, triggered a different
action. To test all of the state transitions, we made a state transition diagram and used plenty of print
statements to verify where each button press would take the user.
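The dispatch logic behind these transitions can be sketched as a per-state table of button rectangles; the
states, coordinates, and function below are simplified stand-ins for our actual layout:

    import pygame

    # Buttons for each state: (rectangle on the PiTFT, state it leads to).
    # Coordinates are illustrative, not our exact layout.
    BUTTONS = {
        'home': [(pygame.Rect(40, 180, 100, 40), 'instructions'),
                 (pygame.Rect(180, 180, 100, 40), 'quit')],
        'instructions': [(pygame.Rect(10, 200, 70, 30), 'library'),
                         (pygame.Rect(90, 200, 70, 30), 'email'),
                         (pygame.Rect(170, 200, 70, 30), 'camera'),
                         (pygame.Rect(250, 200, 60, 30), 'home')],
    }

    def handle_touch(state, pos):
        """Return the next state for a touch at pos, or stay in the current state."""
        for rect, next_state in BUTTONS.get(state, []):
            if rect.collidepoint(pos):
                return next_state
        return state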
The first state is the home screen, which features the start and quit buttons; these allow the user to
progress to the instructions screen or to quit the program. This state serves as a home base of sorts for
the automatic selfie camera.
The second state is the instructions screen, which includes instructions for how to operate the selfie
camera. It also has four buttons: library, email, camera, and back. The back button simply sends the user to
the previous state.
The third state is the library state, reachable by pressing the library button on the instructions screen.
This state includes the complete photo library, which allows users to page through all the photos that have
been taken during the current run of the automatic selfie camera. Users press the ‘prev’ button to see the
previous image, the ‘next’ button to see the next image, and the select button to choose their favorite
photo and return to the instructions screen. Note that the photo library is cyclic, i.e. if you reach the
last photo and press ‘next’, it takes you back to the first photo. We are able to navigate through all the
photos taken thus far via a simple naming scheme: each photo is saved as “selfieX.jpg”, where X is a number.
This number starts at 0 and increases every time you take a photo, so after you have taken five photos, the
library contains selfie0.jpg, selfie1.jpg, selfie2.jpg, selfie3.jpg, and selfie4.jpg. The ‘prev’ and ‘next’
buttons simply decrement or increment a counter (wrapping it around when it is less than 0 or equal to the
number of images in the library), and this counter is then concatenated into the filename used to display
the corresponding image on the screen, as sketched below. The select button saves the selected image’s index
and returns the user to the instructions page, similar to the back button.
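The wrap-around behavior is just modular arithmetic over the number of saved photos; a small sketch (the
helper name is ours, for illustration):

    def step_library(index, direction, num_photos):
        """Move through the photo library cyclically; direction is +1 (next) or -1 (prev)."""
        index = (index + direction) % num_photos
        return index, 'selfie{}.jpg'.format(index)

    # With 5 photos saved, pressing 'next' on selfie4.jpg wraps back to selfie0.jpg:
    print(step_library(4, +1, 5))   # (0, 'selfie0.jpg')
    print(step_library(0, -1, 5))   # (4, 'selfie4.jpg')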
The fourth state is the email state, reachable by pressing the email button on the instructions screen.
This state simply includes a back button (which sends the user to the previous state) and a text box where
you can enter your email address using the attached keyboard. Once you have entered a valid email address,
pressing enter sends that address a message with the selected photo and returns the user to the home screen.
The fifth and final state is the camera state, reachable by pressing the camera button on the instructions
screen. The camera state has only a back button, but it activates the sound-locating camera. In this state,
any time you clap, the camera angles towards the clap, takes a photo, and saves it to the library.
To visualize these transitions, please see the demo portion of our project’s video and the state transition
diagram below:
Issues encountered
Although we successfully completed the project, we encountered a notable challenge concerning microphone
selection. Initially, we employed a pair of non-USB microphones coupled with a bandpass filter circuit. This
setup interfaced with an analog-to-digital converter before connecting to the Raspberry Pi. We faced
difficulties in obtaining accurate inputs and observed minimal change in the audio signals upon a handclap.
Given that our project's essence relied on discerning handclaps amid ambient noise, it became apparent that
the microphones needed to be changed. After experimenting with alternative microphones, we began to suspect
that our bandpass filter was the root cause of our problems. We discarded the filter and decided to do our
frequency filtering in software. This still did not fix our issue. Ultimately, we transitioned to USB
microphones, a decision that significantly streamlined our setup and provided clear and easily interpretable
data. Below is a diagram of our original bandpass filter:
We also encountered a major issue with sending emails from a custom email account. We had trouble getting
around two-factor authentication for the email accounts we created, which is standard for all the providers
we tried (Gmail, AOL, and Yahoo). Through extensive research we found that Gmail supports third-party app
sign-in via a generated app password, which let our script pass a specific key proving it was a trusted
service without going through the interactive authentication flow.
Despite these setbacks, the remainder of our project progressed smoothly. This experience emphasized the
importance of meticulous hardware selection and iterative problem-solving during the development process.
Results
Here is a brief overview of how all the different parts of our finalized automatic selfie camera's system
interact
with each other:
User Interface: Selects the state, which determines what actions are available at that moment.
Microphone Input: The microphones capture audio signals, which are processed to detect a handclap and
determine its direction.
Servo Movement: Based on the direction inferred from the audio signal, the servos adjust their positions
to orient the camera towards the source of the sound.
PiCamera Activation: Once the servos have moved to the desired position, the Python script commands the
PiCamera to take a picture.
Servo Reset: After the picture is taken, the servos return to their default positions to be ready for the
next handclap.
Email Integration: Once the user selects an image and enters their email address, an email is sent from a
custom email address: “ece5725selfiecam@gmail.com”.
We used three main programs to control our entire project: start_screen.py (Figure 1), mic_loop.sh (Figure
2), and start_proj.sh (Figure 3). Together, these handle all of the actions and tasks listed previously.
start_screen.py
Controls user interface visual display
Controls user interface state changes
Controls the camera assembly (servo motors and the picamera)
Takes photos and saves them in a directory
Sends emails with selected photos
Processes audio from .wav files
mic_loop.sh
Polls microphones for audio every second and updates .wav files
start_proj.sh
Clears photo library
Runs mic_loop.sh in the background
Runs start_screen.py in the foreground
The project's objective was to deliver a robust and reliable selfie camera system that accurately responds
to user gestures, providing an innovative and fun way to capture moments with ease. We are proud to have
achieved this goal despite encountering various challenges along the way. Collaborating to solve these
issues encouraged teamwork, helped us develop problem-solving strategies, and improved our Python debugging
and Linux troubleshooting skills. Overall, we consider this project a very successful endeavor both
for the outcome of the automatic selfie camera and for ourselves as engineers.
Conclusions and Acknowledgements
Overall, we are satisfied with how the automatic selfie camera turned out, and we are glad we were able to
bring our idea to life. Our system successfully detects handclaps and takes a picture in the direction of
the handclap. The entire process of working on this project was eye-opening and insightful. One noteworthy
thing we learned was that introducing analog signals for processing over the GPIO pins, even with the help
of an ADC, isn’t always effective. Despite our setbacks, we consider this project a very successful endeavor
both for the
outcome of the automatic selfie camera and for ourselves as engineers.
We extend our heartfelt gratitude to Professor Joe Skovira for his guidance and support throughout the
project. His expertise and encouragement played a pivotal role in shaping our ideas and overcoming
obstacles. We also want to acknowledge the invaluable assistance provided by the teaching assistants, whose
feedback and support were instrumental in our project's success.
Future Work
If we had more time to work on the project, we would have pursued a more robust framework for our directional
movement. Our current system only detects handclaps in three directions (left, right, and center), and
consequently the camera only moves in those three directions. Implementing more finely tuned directional
movement over a wide
range of angles would be a great extension of our project.
Division of Labor
Audrey and Cole both contributed equally to this project; however, they worked on tasks that suited their
strengths. Audrey wired the entire system together, worked with several different kinds of microphones,
found the desired thresholds for the clap amplitude, and was a positive influence on the team in general.
Cole worked on the user interface and the servo directionality calculations for locating the clap, and he
developed the website.
Bill of Materials/Parts List
Raspberry Pi 4, 2G Ram
Capacitive PiTFT
SD card
Raspberry Pi power supply charger
2 Grove microphones
1 motor
1 camera
3 USB microphones
2 breadboards
1 Arduino Uno
5 Adafruit microphones
1 ADS1115 ADC
No parts were purchased. All of these were acquired from the lab, with the exception of the ADC, which was
provided by a team member.